On Content-centric Wireless Delivery Networks
The flux of social media and the convenience of mobile connectivity have
created a mobile data phenomenon that is expected to overwhelm the mobile
cellular networks in the foreseeable future. Despite the advent of 4G/LTE, the
growth rate of wireless data has far exceeded the capacity increase of the
mobile networks. A fundamentally new design paradigm is required to tackle the
ever-growing wireless data challenge.
In this article, we investigate the problem of massive content delivery over
wireless networks and present a systematic view on content-centric network
design and its underlying challenges. Towards this end, we first review some of
the recent advancements in Information Centric Networking (ICN) which provides
the basis on how media contents can be labeled, distributed, and placed across
the networks. We then formulate the content delivery task into a content rate
maximization problem over a shared wireless channel, which, in contrast to the
conventional wisdom that attempts to increase the bit-rate of a unicast system,
maximizes the content delivery capability with a fixed amount of wireless
resources. This conceptually simple change enables us to exploit the "content
diversity" and the "network diversity" by leveraging the abundant computation
resources (through application-layer encoding, pushing and caching, etc.) within
the existing wireless networks. A network architecture that enables wireless
network crowdsourcing for content delivery is then described, followed by an
exemplary campus wireless network that encompasses the above concepts.
Comment: 20 pages, 7 figures, accepted by IEEE Wireless Communications, Sept.201
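
The shift from bit-rate to content rate described in the abstract above can be illustrated with a toy calculation. The sketch below is not the article's formulation: it assumes a hypothetical model in which every transmission occupies one equal slot of a shared channel and every user can hear a broadcast. The function names `channel_uses` and `content_rate` are illustrative only.

```python
from collections import Counter


def channel_uses(requests):
    """Return (unicast_slots, shared_slots) for a list of requested content IDs.

    Assumption: unicast spends one slot per request, while a shared
    (broadcast) channel spends one slot per distinct content item.
    """
    unicast_slots = len(requests)
    shared_slots = len(set(requests))
    return unicast_slots, shared_slots


def content_rate(requests, slots):
    """Count requests satisfied by `slots` broadcasts, most popular content first."""
    counts = Counter(requests)
    return sum(n for _, n in counts.most_common(slots))


# Six requests over three distinct content items.
requests = ["a", "a", "b", "a", "c", "b"]
unicast_slots, shared_slots = channel_uses(requests)
served_with_two_slots = content_rate(requests, slots=2)
```

Under these assumptions, the more often users request the same items, the larger the gap between the unicast slot count and the shared slot count.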
Asymmetric Polynomial Loss For Multi-Label Classification
Various tasks are reformulated as multi-label classification problems, in
which the binary cross-entropy (BCE) loss is frequently utilized for optimizing
well-designed models. However, the vanilla BCE loss cannot be tailored for
diverse tasks, resulting in suboptimal performance for different models.
Besides, the imbalance between redundant negative samples and rare positive
samples could degrade the model performance. In this paper, we propose an
effective Asymmetric Polynomial Loss (APL) to mitigate the above issues.
Specifically, we first perform a Taylor expansion of the BCE loss. Then we adjust
the coefficients of the polynomial terms. We further employ the asymmetric
focusing mechanism to decouple the gradient contribution from the negative and
positive samples. Moreover, we validate that the polynomial coefficients can
recalibrate the asymmetric focusing hyperparameters. Experiments on relation
extraction, text classification, and image classification show that APL
consistently improves performance without extra training burden.
Comment: ICASSP 202
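
The steps named in the abstract above (Taylor expansion of BCE, adjustable polynomial coefficients, asymmetric focusing) can be sketched as follows. This is a minimal illustration, not the paper's exact loss: it uses the standard series -log(p) = sum over m >= 1 of (1 - p)^m / m, and assumes that coefficient adjustments are additive offsets `eps` and that focusing multiplies the positive term by (1 - p)^gamma_pos and the negative term by p^gamma_neg. All names and default values are hypothetical.

```python
import math


def taylor_neg_log(p, order=50, eps=()):
    """Truncated Taylor series of -log(p) around p = 1.

    Computes sum_{m=1..order} (1/m + eps_m) * (1 - p)**m, where eps_m is an
    optional offset on the m-th coefficient (assumed form of the adjustment).
    """
    total = 0.0
    for m in range(1, order + 1):
        shift = eps[m - 1] if m - 1 < len(eps) else 0.0
        total += (1.0 / m + shift) * (1.0 - p) ** m
    return total


def asymmetric_poly_loss(p, y, gamma_pos=0.0, gamma_neg=2.0, eps=(1.0,), order=50):
    """Per-label loss for predicted probability p and binary target y.

    Positive and negative samples get separate focusing exponents, so the
    two gradient contributions can be tuned independently.
    """
    if y == 1:
        return (1.0 - p) ** gamma_pos * taylor_neg_log(p, order, eps)
    return p ** gamma_neg * taylor_neg_log(1.0 - p, order, eps)


# With no offsets the series approaches the usual cross-entropy term.
series_value = taylor_neg_log(0.5, order=60)
exact_value = -math.log(0.5)
```

With `gamma_neg` above zero, easy negatives (small `p`) are down-weighted, which is one way to counter the imbalance between abundant negative and rare positive labels.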
Human-Readable Fingerprint for Large Language Models
Protecting the copyright of large language models (LLMs) has become crucial
due to their resource-intensive training and accompanying carefully designed
licenses. However, identifying the original base model of an LLM is challenging
due to potential parameter alterations. In this study, we introduce a
human-readable fingerprint for LLMs that uniquely identifies the base model
without exposing model parameters or interfering with training. We first
observe that the vector direction of LLM parameters remains stable after the
model has converged during pretraining, showing negligible perturbations
through subsequent training steps, including continued pretraining, supervised
fine-tuning (SFT), and RLHF, which makes it a sufficient condition to identify
the base model. Necessity is validated by continuing to train an LLM with
an extra term that drives the parameter direction away, which damages the
model. However, this direction is vulnerable to simple attacks like
dimension permutation or matrix rotation, which significantly change it without
affecting performance. To address this, leveraging the Transformer structure,
we systematically analyze potential attacks and define three invariant terms
that identify an LLM's base model. We make these invariant terms human-readable
by mapping them to a Gaussian vector using a convolutional encoder and then
converting it into a natural image with StyleGAN2. Our method generates a dog
image as an identity fingerprint for an LLM, where the dog's appearance
strongly indicates the LLM's base model. The fingerprint provides intuitive
information for qualitative discrimination, while the invariant terms can be
employed for quantitative and precise verification. Experimental results across
various LLMs demonstrate the effectiveness of our method.
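
The permutation attack mentioned in the abstract above can be illustrated on a toy two-layer network. The sketch below is not the paper's method: the weights are made-up values, and the product `w2 @ w1` is only one example of a quantity that survives a hidden-unit permutation. The abstract does not specify the paper's three invariant terms.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cosine(u, v):
    """Cosine similarity between two flat vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


def flatten(*mats):
    """Concatenate all matrix entries into one parameter vector."""
    return [x for m in mats for row in m for x in row]


def permute_hidden(w1, w2, perm):
    """Reorder hidden units: rows of w1 and the matching columns of w2."""
    return [w1[i] for i in perm], [[row[i] for i in perm] for row in w2]


def forward(w1, w2, x):
    """Two-layer ReLU network: w2 @ relu(w1 @ x)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


# Hypothetical weights: 2 inputs, 3 hidden units, 2 outputs.
w1 = [[1.0, 2.0], [-3.0, 0.5], [0.0, 4.0]]
w2 = [[2.0, -1.0, 0.5], [1.0, 3.0, -2.0]]
w1p, w2p = permute_hidden(w1, w2, perm=[2, 0, 1])

x = [1.0, 1.0]
direction_similarity = cosine(flatten(w1, w2), flatten(w1p, w2p))
```

In this example the permuted network computes the same function and has the same `w2 @ w1` product, yet the cosine similarity of the flattened parameters drops far below 1. This is why a raw parameter direction alone is not a robust fingerprint.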
- …